128 research outputs found
Modeling Rare Interactions in Time Series Data Through Qualitative Change: Application to Outcome Prediction in Intensive Care Units
Many areas of research are characterised by a deluge of large-scale,
highly-dimensional time-series data. However, using the available data for
prediction and decision making is hampered by our current inability to
uncover and quantify the true interactions that explain the outcomes. We are
interested in areas such as intensive care medicine, which are characterised
by: i) continuous monitoring of multivariate variables and non-uniform
sampling of data streams; ii) outcomes that are generally governed by
interactions between a small set of rare events; iii) interactions that are
not necessarily definable by specific values (or value ranges) of a given
group of variables, but rather by the deviations of these values from the
normal state recorded over time; and iv) the need to explain the predictions
made by the model. While numerous data mining models have been formulated for
outcome prediction, they are unable to explain their predictions.
We present a model for uncovering interactions with the highest likelihood of
generating the outcomes seen from highly-dimensional time series data.
Interactions among variables are represented by a relational graph structure,
which relies on qualitative abstractions to overcome non-uniform sampling and
to capture the semantics of the interactions corresponding to the changes and
deviations from normality of variables of interest over time. Using the
assumption that similar templates of small interactions are responsible for the
outcomes (as prevalent in the medical domains), we reformulate the discovery
task to retrieve the most-likely templates from the data.
Comment: 8 pages, 3 figures. Accepted for publication in the European
Conference on Artificial Intelligence (ECAI 2020).
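The qualitative-abstraction idea described in the abstract can be illustrated
with a minimal sketch. This is not the authors' implementation; the function
name, thresholds and normal ranges below are illustrative assumptions. Each
irregularly timed sample is mapped to a qualitative state (deviation from
normality) and a trend, with the trend normalised by the actual time gap so
that non-uniform sampling does not distort it:

```python
# Sketch: qualitative abstraction of a non-uniformly sampled series.
# Names and thresholds are illustrative, not taken from the paper.

def abstract_series(samples, normal_range, slope_eps=0.1):
    """Map (time, value) samples to qualitative (state, trend) symbols.

    state: 'low' / 'normal' / 'high' relative to normal_range
    trend: 'rising' / 'steady' / 'falling' from the slope between
           consecutive samples; dividing by the actual time gap makes
           the trend robust to non-uniform sampling.
    """
    lo, hi = normal_range
    out = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        state = 'low' if v1 < lo else 'high' if v1 > hi else 'normal'
        slope = (v1 - v0) / (t1 - t0)
        trend = ('rising' if slope > slope_eps
                 else 'falling' if slope < -slope_eps
                 else 'steady')
        out.append((t1, state, trend))
    return out

# A heart-rate-like stream sampled at irregular times:
hr = [(0, 72), (2, 75), (7, 90), (8, 118), (15, 117)]
print(abstract_series(hr, normal_range=(60, 100)))
```

Sequences of such (state, trend) symbols, rather than raw values, are what a
relational graph over interacting variables could then be built from.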
Quantifying Health Inequalities Induced by Data and AI Models
AI technologies are being increasingly tested and applied in critical
environments, including healthcare. Without an effective way to detect and
mitigate AI-induced inequalities, AI might do more harm than good,
potentially widening underlying inequalities. This paper proposes a generic
allocation-deterioration framework for detecting and quantifying AI-induced
inequality. Specifically, AI-induced inequalities are quantified as the area
between two allocation-deterioration curves. To assess the framework's
performance, experiments were conducted on ten synthetic datasets (N>33,000)
generated from HiRID, a real-world Intensive Care Unit (ICU) dataset, showing
its ability to accurately detect and quantify inequality proportionally to
controlled inequalities. Extensive analyses were carried out to quantify
health inequalities (a) embedded in two real-world ICU datasets and (b)
induced by AI models trained for two resource allocation scenarios. Results
showed that, compared to men, women had up to 33% poorer deterioration in
markers of prognosis when admitted to HiRID ICUs. All four AI models assessed
were shown to induce significant inequalities (2.45% to 43.2%) for non-White
compared to White patients. The models significantly exacerbated
data-embedded inequalities in 3 out of 8 assessments, one of which was >9
times worse.
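The area-between-curves quantity at the heart of the framework can be
sketched with simple trapezoidal integration. The curves below are made-up
illustrations, not the HiRID results, and the function names are assumptions:

```python
# Sketch: inequality as the area between two allocation-deterioration
# curves, integrated with the trapezoid rule. Curves are toy values.

def trapezoid_area(xs, ys):
    """Area under a piecewise-linear curve given by (xs, ys)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))

def area_between(xs, curve_a, curve_b):
    """Area between two curves sampled on the same allocation grid."""
    return trapezoid_area(xs, [a - b for a, b in zip(curve_a, curve_b)])

# Allocation fraction on the x-axis, deterioration index on the y-axis:
alloc = [0.0, 0.25, 0.5, 0.75, 1.0]
group_a = [0.0, 0.10, 0.25, 0.45, 0.70]  # e.g. one patient group
group_b = [0.0, 0.15, 0.35, 0.60, 0.90]  # e.g. another group
gap = area_between(alloc, group_b, group_a)
print(round(gap, 4))  # larger gap = larger induced inequality
```

A zero area would mean the two groups deteriorate identically across all
allocation levels; a growing area signals widening group-level inequality.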
Edinburgh_UCL_Health@SMM4H'22: From GloVe to Flair for handling imbalanced healthcare corpora related to Adverse Drug Events, Change in medication and self-reporting vaccination
This paper reports on the performance of Edinburgh_UCL_Health's models in the Social Media Mining for Health (SMM4H) 2022 shared tasks. Our team participated in the tasks related to the identification of Adverse Drug Events (ADEs), the classification of change in medication (change-med) and the classification of self-report of vaccination (self-vaccine). Our best-performing models are based on DeepADEMiner (with respective F1 = 0.64, 0.62 and 0.39 for ADE identification), on a GloVe model trained on Twitter (with F1 = 0.11 for change-med) and on a stacked embedding combining one layer of GloVe embeddings and two layers of Flair embeddings (with F1 = 0.77 for self-report).
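A "stacked embedding" of the kind used for the self-report task is, at its
core, a per-token concatenation of vectors from several embedding models. The
sketch below uses toy stand-in layers rather than trained GloVe and Flair
models, so the dimensions and values are purely illustrative:

```python
# Sketch: a stacked embedding concatenates per-token vectors from
# several embedding layers (toy GloVe- and Flair-like stand-ins here;
# the real system stacks trained GloVe and Flair models).

def stack_embeddings(token, layers):
    """Concatenate the vectors produced by each embedding layer."""
    vec = []
    for layer in layers:
        vec.extend(layer(token))
    return vec

# Toy stand-ins for one GloVe layer and two Flair layers:
glove = lambda tok: [0.1, 0.2]   # 2-dim "word-level" embedding
flair_fwd = lambda tok: [0.3]    # 1-dim contextual embedding
flair_bwd = lambda tok: [0.4]
v = stack_embeddings("vaccine", [glove, flair_fwd, flair_bwd])
print(v)  # one 4-dim stacked vector per token
```

The classifier downstream then sees a single wider vector per token, letting
static word-level and contextual character-level signals complement each
other.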
Explainable Automated Coding of Clinical Notes using Hierarchical Label-wise Attention Networks and Label Embedding Initialisation
Diagnostic or procedural coding of clinical notes aims to derive a coded
summary of disease-related information about patients. Such coding is usually
done manually in hospitals but could potentially be automated to improve the
efficiency and accuracy of medical coding. Recent studies on deep learning for
automated medical coding has achieved promising performance. However, the
explainability of these models is usually poor, preventing them from being
used confidently to support clinical practice. Another limitation is that these
models mostly assume independence among labels, ignoring the complex
correlation among medical codes which can potentially be exploited to improve
the performance. We propose a Hierarchical Label-wise Attention Network (HLAN),
which aims to interpret the model by quantifying the importance (as attention
weights) of the words and sentences related to each label. Secondly, we
propose to enhance the major deep learning models with a label embedding (LE)
initialisation approach, which learns a dense, continuous vector representation
and then injects the representation into the final layers and the label-wise
attention layers in the models. We evaluated the methods using three settings
on the MIMIC-III discharge summaries: full codes, top-50 codes, and the UK NHS
COVID-19 shielding codes. Experiments were conducted to compare HLAN and LE
initialisation to the state-of-the-art neural-network-based methods. HLAN
achieved the best Micro-level AUC on the top-50 code prediction and
comparable results on the NHS COVID-19 shielding code prediction to other
models. By highlighting the most salient words and sentences for each label,
HLAN showed more meaningful and comprehensive model interpretation compared to
its downgraded baselines and the CNN-based models. LE initialisation
consistently boosted most deep learning models for automated medical coding.
Comment: Accepted to Journal of Biomedical Informatics, structured abstract in
full text, 21 pages, 5 figures, 4 supplementary materials (4 extra pages).
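The label-wise attention idea can be illustrated with a small numerical
sketch. This is a toy illustration, not the trained HLAN: the vectors, label
names and dimensions below are invented. The key point is that each label has
its own query vector, so the attention weights (the model's explanation)
differ per label:

```python
import math

# Sketch of label-wise attention: each label gets its own attention
# distribution over words. All values below are toy examples.

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def label_wise_attention(word_vecs, label_queries):
    """Per-label attention weights and context vectors.

    word_vecs: one vector per word in the clinical note
    label_queries: one query vector per label (e.g. per medical code)
    """
    results = {}
    for label, q in label_queries.items():
        scores = [sum(w * qd for w, qd in zip(vec, q)) for vec in word_vecs]
        weights = softmax(scores)
        context = [sum(w * vec[d] for w, vec in zip(weights, word_vecs))
                   for d in range(len(word_vecs[0]))]
        results[label] = (weights, context)
    return results

words = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
queries = {"code_A": [2.0, 0.0], "code_B": [0.0, 2.0]}
att = label_wise_attention(words, queries)
# Different labels attend to different words:
print({k: [round(w, 2) for w in v[0]] for k, v in att.items()})
```

Because the weights are computed per label, highlighting the top-weighted
words for each predicted code yields exactly the kind of label-specific
explanation the abstract describes.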
Machine Learning to Classify Cardiotocography for Fetal Hypoxia Detection
Fetal hypoxia can cause damaging consequences for babies, such as stillbirth and cerebral palsy. Cardiotocography (CTG) has been used to detect intrapartum fetal hypoxia during labor. It is a non-invasive device that measures the fetal heart rate and uterine contractions. Visual CTG interpretation suffers from inconsistencies among clinicians that can delay interventions. Machine learning (ML) has shown potential in classifying abnormal CTG, allowing automatic interpretation. In the absence of a gold standard, researchers have used various surrogate biomarkers to classify CTG, some of which are clinically irrelevant. We propose using Apgar scores as the surrogate benchmark of babies' ability to recover from birth. Apgar scores assess a newborn's appearance, pulse, grimace, activity and respiration; the higher the Apgar score, the healthier the baby. We employ signal processing methods to pre-process 552 raw CTG recordings and extract validated features. We also include CTG-specific characteristics as outlined in the NICE guidelines. We employ ML techniques using 22 features and compare performance between ML classifiers. While we found that ML can distinguish CTG with low Apgar scores, results for the lowest Apgar scores, which are rare in the dataset we used, would benefit from more CTG data. An external dataset is needed to validate our model for generalisability and to ensure that it does not overfit a specific population.
Clinical Relevance: This study demonstrated the potential of using a clinically relevant benchmark for classifying CTG to allow automatic early detection of hypoxia and reduce decision-making time in maternity units.
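Two of the CTG-style characteristics mentioned in clinical guidance, baseline
fetal heart rate and short-term variability, can be sketched from a raw FHR
trace as follows. This is illustrative only: the paper does not list its 22
features, and the function name and definitions here are assumptions (median
as baseline, mean absolute beat-to-beat difference as variability):

```python
# Sketch of two CTG-style features (baseline fetal heart rate and
# short-term variability) computed from a raw FHR trace in bpm.

def fhr_features(fhr):
    """Return (baseline_bpm, short_term_variability) for an FHR trace."""
    baseline = sorted(fhr)[len(fhr) // 2]  # median as a robust baseline
    # Mean absolute difference between consecutive samples:
    stv = sum(abs(b - a) for a, b in zip(fhr, fhr[1:])) / (len(fhr) - 1)
    return baseline, stv

trace = [140, 142, 138, 141, 139, 143, 140]
baseline, stv = fhr_features(trace)
print(baseline, round(stv, 2))
```

Feature vectors like this, combined with uterine-contraction features, would
then feed a standard ML classifier trained against the Apgar-score benchmark.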